"There are at least 'several hundred' incompetents now in the school system [says 
the superintendent]. Other observers think there are several thousands, while still 
others insist that 'several' would be nearer the mark. Whether these incompetents 
were unfit to teach at any time, or have been rendered unfit by the passing years, 
is a matter of opinion. The question is, why are they allowed to remain?"¹



So wrote The New York Times — in 1936. 

In the 73 years since, we have made little progress toward 
answering the question of why poor instruction in our 
schools goes unaddressed. The question has been the 
subject of vigorous discussion, but most commentary has 
attempted to answer it by debating the failure of school 
districts to dismiss teachers who perform poorly. 

The contours of this debate are well known. One side 
claims that teacher tenure and due process protections 
render dismissal a practical impossibility, shielding 
ineffective teachers from removal in all but the most 
egregious instances. The other argues that the process 
provides only minimal protection against arbitrary or 
discriminatory dismissal, but that administrators fail to 
document poor performance adequately and refuse to 
provide struggling teachers with sufficient support. 

For decades these positions have remained largely unchanged. 

The established arguments, however, fail to recognize 
that the challenge of addressing performance in the 
teaching profession goes far beyond the issue of dismissal. 
In fact, as this report illustrates, school districts fail to 
acknowledge or act on differences in teacher performance 
almost entirely. When it comes to officially appraising 
performance and supporting improvement, a culture 
of indifference about the quality of instruction in each 
classroom dominates. 

Our research confirms what is by now common 
knowledge: tenured teachers are identified as ineffective 
and dismissed from employment with exceptional 
infrequency. While an important finding in its own 
right, we have come to understand that infrequent 
teacher dismissals are in fact just one symptom of a 
larger, more fundamental crisis — the inability of our 
schools to assess instructional performance accurately 
or to act on this information in meaningful ways. 



This inability not only keeps schools from dismissing 
consistently poor performers, but also prevents them 
from recognizing excellence among top performers or 
supporting growth among the broad plurality of hard- 
working teachers who operate in the middle of the 
performance spectrum. Instead, school districts default to 
treating all teachers as essentially the same, both in terms 
of effectiveness and need for development. 

Of course, as teachers themselves are acutely aware, 
they are not at all the same. Just like professionals in 
other fields, teachers vary. They boast individual skills, 
competencies and talents. They generate different 
responses and levels of growth from students. 

In a knowledge-based economy that makes education 
more important than ever, teachers matter more 
than ever. This report is a call to action — to policy- 
makers, district and school leaders and to teachers and 
their representatives — to address our national failure 
to acknowledge and act on differences in teacher 
effectiveness once and for all. To do this, school districts 
must begin to distinguish great from good, good from fair, 
and fair from poor. Effective teaching must be recognized; 
ineffective teaching must be addressed. 

Recently, President Obama spoke in bold terms about 
improving teacher effectiveness in just this way, saying, 

“If a teacher is given a chance or two chances or three 
chances but still does not improve, there is no excuse 
for that person to continue teaching. I reject a system 
that rewards failure and protects a person from its 
consequences. The stakes are too high. We can afford 
nothing but the best when it comes to our children’s 
teachers and the schools where they teach.”²

We could not agree more. It is our hope that the
recommendations contained in this report will outline a
path to a better future for the profession.



FOREWORD 



EXECUTIVE SUMMARY 



Suppose you are a parent determined to make sure your child gets the best possible 
education. You understand intuitively what an ample body of research proves: that your 
child’s education depends to a large extent on the quality of her teachers. Consequently, 
as you begin considering local public schools, you focus on a basic question: who are the best 
teachers, and where do they teach? 

The question is simple enough. There’s just one problem — except for word of mouth from other 
parents, no one can tell you the answers. 

In fact, you would be dismayed to discover that not only can no one tell you which teachers are 
most effective, they also cannot say which are the least effective or which fall in between. Were 
you to examine the district’s teacher evaluation records yourself, you would find that, on paper, 
almost every teacher is a great teacher, even at schools where the chance of a student succeeding 
academically amounts to a coin toss, at best. 

In short, the school district would ask you to trust that it can provide your child a quality 
education, even though it cannot honestly tell you whether it is providing her a quality teacher. 

This is the reality for our public school districts nationwide. Put simply, they fail to distinguish 
great teaching from good, good from fair, and fair from poor. A teacher’s effectiveness — the most 
important factor for schools in improving student achievement — is not measured, recorded, or 
used to inform decision-making in any meaningful way. 
The Widget Effect 

This report examines our pervasive and longstanding failure to recognize and respond to 
variations in the effectiveness of our teachers. At the heart of the matter are teacher evaluation 
systems, which in theory should serve as the primary mechanism for assessing such variations, 
but in practice tell us little about how one teacher differs from any other, except for teachers whose 
performance is so egregiously poor as to warrant dismissal. 

The failure of evaluation systems to provide accurate and credible information about individual 
teachers’ instructional performance sustains and reinforces a phenomenon that we have come to 
call the Widget Effect. The Widget Effect describes the tendency of school districts to assume 
classroom effectiveness is the same from teacher to teacher. This decades-old fallacy fosters an 
environment in which teachers cease to be understood as individual professionals and are instead treated as 
interchangeable parts. In its denial of individual strengths and weaknesses, it is deeply disrespectful 
to teachers; in its indifference to instructional effectiveness, it gambles with the lives of students. 

Today, the Widget Effect is codified in a policy framework that rarely considers teacher 
effectiveness for key decisions, as illustrated below. 



Where Is Performance a Factor in Important Decisions About Teachers?* 
The fact that information on teacher performance is almost exclusively used for decisions related 
to teacher remediation and dismissal paints a stark picture: In general, our schools are indifferent 
to instructional effectiveness — except when it comes time to remove a teacher. 



*See “Policy Implications of the Widget Effect” for additional information. 
Study Overview 

This report is the product of an extensive research effort spanning 12 districts and 
four states. It reflects survey responses from approximately 15,000 teachers and 
1,300 administrators, and it has benefited from the insight of more than 80 local 
and state education officials, teachers union leaders, policymakers and advocates who 
participated in advisory panels in each state, shaping the study design, data collection 
instruments, and findings and recommendations. 

The four states included in the study, Arkansas, Colorado, Illinois and Ohio, employ 
diverse teacher performance management policies. The 12 districts studied range in 
size, geographic location, evaluation policies and practices and overall approach to 
teacher performance management. Jonesboro Public Schools, the smallest district 
studied, serves approximately 4,450 students; Chicago Public Schools, the largest, 
serves 413,700. All 12 districts employ some formal evaluation process for teachers, 
but the methods and frequency of evaluation differ. The outcomes, however, are 
strikingly similar. 



Study Sites* 

AR: El Dorado Public Schools; Jonesboro Public Schools; Little Rock School District; Springdale Public Schools 
CO: Denver Public Schools; Pueblo City Schools 
IL: Chicago Public Schools; District U-46 (Elgin); Rockford Public Schools 
OH: Akron Public Schools; Cincinnati Public Schools; Toledo Public Schools 

*For more information on the study sites, please see Methodology. 





Characteristics of the Widget Effect in Teacher Evaluation 

The Widget Effect is characterized by institutional indifference to variations in teacher performance. 
Teacher evaluation systems reflect and reinforce this indifference in several ways. 



All teachers are rated good or great 
In districts that use binary evaluation ratings (generally 
“satisfactory” or “unsatisfactory”), more than 
99 percent of teachers receive the satisfactory rating. 
Districts that use a broader range of rating options do 
little better; in these districts, 94 percent of teachers 
receive one of the top two ratings and less than 
1 percent are rated unsatisfactory. 

Excellence goes unrecognized 
When all teachers are rated good or great, those who 
are truly exceptional cannot be formally identified. 
Fifty-nine percent of teachers and 63 percent of 
administrators say their district is not doing enough 
to identify, compensate, promote and retain the most 
effective teachers. 

Inadequate professional development 
The failure to assess variations in instructional 
effectiveness also precludes districts from identifying 
specific development needs in their teachers. In 
fact, 73 percent of teachers surveyed said their most 
recent evaluation did not identify any development 
areas, and only 45 percent of teachers who did have 
development areas identified said they received useful 
support to improve. 



No special attention to novices 
Inattention to teacher performance and development 
begins from a teacher’s first days in the classroom. 
Though it is widely recognized that teachers are 
least effective in their beginning years, 66 percent 
of novice teachers in districts with multiple ratings 
received a rating greater than “satisfactory” on their 
most recent performance evaluation. Low expectations 
characterize the tenure process as well, with 41 percent 
of administrators reporting that they have never “non- 
renewed” a probationary teacher for performance 
concerns in his or her final probationary year. 

Poor performance goes unaddressed 
Despite uniformly positive evaluation ratings, teachers and 
administrators both recognize ineffective teaching in their 
schools. In fact, 81 percent of administrators and 57 percent 
of teachers say there is a tenured teacher in their school 
who is performing poorly, and 43 percent of teachers say 
there is a tenured teacher who should be dismissed for poor 
performance. Troublingly, the percentages are higher in 
high-poverty schools. But district records confirm the 
scarcity of formal dismissals; at least half of the districts 
studied did not dismiss a single non-probationary teacher 
for poor performance in the time period studied (ranging 
from two to five years in each district). 
Flaws in Evaluation Practice and Implementation 

The characteristics above are exacerbated and amplified by cursory evaluation practices and poor implementation. 
Evaluations are short and infrequent (most are based on two or fewer classroom observations, each 60 minutes or less), 
conducted by administrators without extensive training, and influenced by powerful cultural forces — in particular, an 
expectation among teachers that they will be among the vast majority rated as top performers. 

While it is impossible to know whether the system drives the culture or the culture the system, the result is clear — 
evaluation systems fail to differentiate performance among teachers. As a result, teacher effectiveness is largely ignored. 
Excellent teachers cannot be recognized or rewarded, chronically low-performing teachers languish, and the vast 
majority of teachers performing at moderate levels do not get the differentiated support and development they need to 
improve as professionals. 
Reversing the Widget Effect 

The Widget Effect is deeply ingrained in the fundamental systems and policies that govern 
the teachers in our public schools. Better evaluation systems may offer a partial solution, but 
they will not overcome a culture of indifference to classroom effectiveness. Reversing the 
Widget Effect depends on better information about instructional quality that can be used to 
inform other important decisions that dictate who teaches in our schools. 

01 | Adopt a comprehensive performance evaluation system that fairly, 
accurately and credibly differentiates teachers based on their effectiveness 
in promoting student achievement. Teachers should be evaluated based on their 
ability to fulfill their core responsibility as professionals — delivering instruction that 
helps students learn and succeed. This demands clear performance standards, multiple 
rating options, regular monitoring of administrator judgments, and frequent feedback 
to teachers. Furthermore, it requires professional development that is tightly linked to 
performance standards and differentiated based on individual teacher needs. 

The core purpose of evaluation must be maximizing teacher growth and effectiveness, 
not just documenting poor performance as a prelude to dismissal. 

02 | Train administrators and other evaluators in the teacher performance 
evaluation system and hold them accountable for using it effectively. 

The differentiation of teacher effectiveness should be a priority for school 
administrators and one for which they are held accountable. Administrators must 
receive rigorous training and ongoing support so that they can make fair and consistent 
assessments of performance against established standards and provide constructive 
feedback and differentiated support to teachers. 
03 | Integrate the performance evaluation system with critical human capital policies 
and functions such as teacher assignment, professional development, compensation, 
retention and dismissal. Even the best evaluation system will fail if the information it produces 
is of no consequence. An effective evaluation system must be fully integrated with other district 
systems and policies and a primary factor in decisions such as which teachers receive tenure, how 
teachers are assigned and retained, how teachers are compensated and advanced, what professional 
development teachers receive, and when and how teachers are dismissed. Only by attaching stakes 
to evaluation outcomes will teachers and administrators invest in the hard work of creating a truly 
rigorous and credible evaluation system. 

04 | Adopt dismissal policies that provide lower-stakes options for ineffective 
teachers to exit the district and a system of due process that is fair but efficient. 

If the evaluation system is implemented effectively, unsatisfactory ratings will not be anomalous, 
surprising or without clear justification. Likewise, the identification of development areas and the 
provision of support will be continual. As in other professions, teachers who see significant, credible 
evidence of their own failure to meet standards are likely to exit voluntarily. Districts can facilitate 
this process by providing low-stakes options that enable teachers to leave their positions without 
being exiled. For teachers who must be officially dismissed, an expedited, one-day hearing should be 
sufficient for an arbitrator to determine if the evaluation and development process was followed and 
judgments made in good faith. 

Our recommendations outline a comprehensive approach to improving teacher effectiveness and 
maximizing student learning. If implemented thoroughly and faithfully, we believe they will enable districts 
to understand and manage instructional quality with far greater sophistication. Improved evaluation will 
not only benefit students by driving the systematic improvement and growth of their teachers, but teachers 
themselves, by at last treating them as professionals, not parts. 
THE PROBLEM: TEACHERS AS 
INTERCHANGEABLE PARTS 



Teaching is the essence of education, and there is almost universal agreement 
among researchers that teachers have an outsized impact on student performance. 

We know that improving teacher quality is one of the most powerful ways — if not 
the most powerful way — to create better schools. In fact, a student assigned to a very 
good teacher for a single school year may gain up to a full year’s worth of additional 
academic growth compared to a student assigned to a very poor teacher. Having a series 
of strong or weak teachers in consecutive years compounds the impact. Give high-need 
students three highly effective teachers in a row and they may outperform students 
taught by three ineffective teachers in a row by as much as 50 percentile points.³

The lesson from these decades of research is clear: teachers matter. Some teachers are 
capable of generating exceptional learning growth in students; others are not, and a 
small group actually hinders their students’ academic progress. 



This simple premise — that teachers matter — has driven The New Teacher Project’s 
prior research and continues to drive our work today. Our 2003 report, Missed Opportunities: 
How We Keep High-Quality Teachers Out of Urban Classrooms, documented how vacancy 
notification policies, rigid staffing rules and late budget timelines caused urban 
districts to hire too late to capture the highest-quality teacher applicants. Our 2005 
report, Unintended Consequences: The Case for Reforming the Staffing Rules in Urban Teachers 
Union Contracts, illustrated how contractual staffing rules, built around the assumption 
that any teacher could fill any vacancy, forced schools to hire teachers they did not 
want and teachers to take positions for which they might not be a good fit. 

Each of these reports in its own way documented a flawed assumption that 
has pervaded American educational policy for decades — the assumption that 
teachers are interchangeable parts. We have come to call this phenomenon the 
Widget Effect. In the presence of the Widget Effect, school systems wrongly 
conflate educational access with educational quality; the only teacher quality goal 
that schools need to achieve is to fill all of their positions. It becomes a foregone 
conclusion that, so long as there is an accredited teacher — any teacher — in front of 
the classroom, students are being served adequately. 

While the Widget Effect pervades many aspects of our education system, it is 
in teacher evaluation that both its architecture and its consequences are most 
immediately apparent. In this report, we examine the central role that the design 
and implementation of teacher evaluation systems play in creating and reinforcing 
the Widget Effect; how teacher and administrator beliefs about evaluation illustrate 
the Widget Effect at work; and how the Widget Effect fuels a policy framework that 
ignores both strong and weak teacher performance. In the absence of meaningful 
performance information, teacher effectiveness is treated as a constant, not a variable, 
and school districts must instead rely on other considerations — many of them 
unrelated to student academic success — to make critical workforce decisions. 



CHARACTERISTICS: 

THE WIDGET EFFECT IN 
TEACHER EVALUATION 



“Poorly performing teachers 
are rated at the same level 
as the rest of us. This 
infuriates those of us who 
do a good job.” 

-Akron Public Schools Teacher 








The Widget Effect is rooted in the failure of teacher evaluation 
systems to produce meaningful information about teacher 
effectiveness. In theory, an evaluation system should identify 
and measure individual teachers’ strengths and weaknesses 
accurately and consistently, so that teachers get the feedback 
they need to improve their practice and so that schools can 
determine how best to allocate resources and provide support. 
In practice, teacher evaluation systems devalue instructional 
effectiveness by generating performance information that 
reflects virtually no variation among teachers at all. 

This fundamental failing has a deeply insidious effect on teachers 
and schools by institutionalizing indifference when it comes to 
performance. As a result, important variations between teachers 
vanish. Excellence goes unrecognized, development is neglected 
and poor performance goes unaddressed. 

All Teachers Are Rated Good or Great 

The disconnect between teacher evaluation systems and 
actual teacher performance is most strikingly illustrated by the 
wide gap between student outcomes and teacher ratings in 
many districts. Though thousands of teachers included in this 
report teach in schools where high percentages of students 
fail year after year to meet basic academic standards, less than 
one percent of surveyed teachers received a negative rating on 
their most recent evaluation.⁴

This is not to say that responsibility for a failing school rests 
on the shoulders of teachers alone, or that none of these 
teachers demonstrated truly high performance; however, there 
can be no doubt that these ratings dramatically overstate the 
number of exemplary teachers and understate the number 
with moderate and severe performance concerns. These 
data simultaneously obscure poor performance and overlook 
excellence, as the value of superlative teacher ratings is 
rendered meaningless by their overuse. 

To a large degree, teacher evaluation systems codify this 
whitewashing of performance differences, beginning with 
the rating categories themselves. Five of the ten districts in 
this study with available teacher evaluation rating data⁵ use 
a binary rating system for assessing teacher performance; 
teachers are categorized as either “satisfactory” or 
“unsatisfactory.”⁶ There are no shades of gray to describe 
nuances in performance. 

FIGURE 02 | Evaluation Ratings for Tenured Teachers in Districts with Multiple-Rating Systems* 



As Figure 01 illustrates, in districts that use binary ratings, 
virtually all tenured⁷ teachers (more than 99 percent) receive 
the satisfactory rating; the number receiving an unsatisfactory 
rating amounts to a fraction of a percent. In these districts, 
it makes little difference that two ratings are available; in 
practice only one is ever used. 

FIGURE 01 | Evaluation Ratings for Tenured Teachers in Districts with Binary Rating Systems* 

[Chart of satisfactory ratings¹¹ (or equivalent) and unsatisfactory ratings¹² (or equivalent) by district. Surviving district labels: Denver Public Schools⁸ (SY 05-06 to 07-08), Springdale Public Schools (SY 05-06 to 07-08)¹⁰ and Toledo Public Schools (SY 03-04 to 07-08). Surviving data labels: unsatisfactory counts of 10 (0.3%), 2 (0.3%) and 0 (0%); satisfactory counts of 3,966, 660, 1,772 and 1,105. The pairing of counts with districts is not recoverable here.] 



One might hope that teacher evaluation systems that employ a 
broader range of rating options would more accurately reflect 
the performance differences among teachers. However, even 
when given multiple ratings from which to choose, evaluators 
in all districts studied rate the majority of teachers in the top 
category, rather than assigning the top rating to only those 
teachers who actually outperform the majority of their peers. 
As illustrated in Figure 02, in the five districts with multiple 
teacher evaluation ratings for which data were available,¹³ 
70 percent of tenured teachers still received the highest rating.¹⁴ 
Another 24 percent received the second-highest rating. 

While districts using multiple rating systems do show some 
additional variability in teacher evaluation beyond those using 
binary rating systems, districts with four or more ratings still 
assign tenured teachers the lowest two rating options in one 
out of 16 cases.¹⁵ In each case, the basic outcome remains 
true: almost no teachers are identified as delivering 
unsatisfactory instruction. 



AKRON PUBLIC SCHOOLS, SY 05-06 to 07-08 
  Outstanding           638 (60.1%) 
  Very Good             332 (31.3%) 
  Satisfactory           85 (8.0%) 
  Improvement Needed      7 (0.7%) 
  Unsatisfactory          0 (0.0%) 

CHICAGO PUBLIC SCHOOLS, SY 03-04 to 07-08 
  Superior           25,332 (68.7%) 
  Excellent           9,176 (24.9%) 
  Satisfactory        2,232 (6.1%) 
  Unsatisfactory        149 (0.4%) 

CINCINNATI PUBLIC SCHOOLS, SY 03-04 to 07-08 (ratings for domain "Teaching for Student Learning") 
  Distinguished                    100 (57.8%) 
  Proficient/Satisfactory           60 (34.7%) 
  Basic                             12 (6.9%) 
  Not Proficient/Unsatisfactory      1 (0.6%) 

DISTRICT U-46 (ELGIN), SY 03-04 to 06-07 
  Excellent          2,035 (88.1%) 
  Satisfactory         264 (11.4%) 
  Unsatisfactory        11 (0.5%) 

ROCKFORD PUBLIC SCHOOLS, SY 03-04 to 07-08 
  Excellent          1,583 (80.2%) 
  Satisfactory         374 (18.9%) 
  Unsatisfactory        18 (0.9%) 

*Note: Evaluation rating data in Figures 01 and 02 were collected from each district. 
Data are as accurate as the records provided to TNTP for this study. 
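
The pooled shares quoted in the text (70 percent in the highest rating, 24 percent in the second-highest, and roughly one in 16 in the lowest two ratings for districts with four or more options) can be re-derived from the Figure 02 counts. The following is a minimal sketch, assuming the district counts exactly as reported in Figure 02; the dictionary layout and function names are illustrative and not part of the study.

```python
# Tenured-teacher rating counts from Figure 02, ordered from highest to lowest rating.
# Cincinnati counts cover only the "Teaching for Student Learning" domain.
FIGURE_02_COUNTS = {
    "Akron": [638, 332, 85, 7, 0],
    "Chicago": [25332, 9176, 2232, 149],
    "Cincinnati": [100, 60, 12, 1],
    "District U-46 (Elgin)": [2035, 264, 11],
    "Rockford": [1583, 374, 18],
}


def share_at_rank(counts_by_district, rank):
    """Pooled share of teachers whose rating sits at `rank` (0 = highest)."""
    at_rank = sum(counts[rank] for counts in counts_by_district.values())
    total = sum(sum(counts) for counts in counts_by_district.values())
    return at_rank / total


def share_in_lowest_two(counts_by_district, min_options=4):
    """Pooled share in the two lowest ratings, limited to districts
    that offer at least `min_options` rating options."""
    eligible = [c for c in counts_by_district.values() if len(c) >= min_options]
    lowest_two = sum(sum(counts[-2:]) for counts in eligible)
    total = sum(sum(counts) for counts in eligible)
    return lowest_two / total


top = share_at_rank(FIGURE_02_COUNTS, 0)
second = share_at_rank(FIGURE_02_COUNTS, 1)
low = share_in_lowest_two(FIGURE_02_COUNTS)

print(f"highest rating: {top:.1%}")
print(f"second-highest rating: {second:.1%}")
print(f"lowest two ratings (4+ options): {low:.1%}, about 1 in {1 / low:.0f}")
```

Pooling counts across districts, rather than averaging district percentages, means Chicago dominates the totals; that appears to be how the report's aggregate figures were computed, since the pooled results round to the percentages quoted in the text.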
These data often stand in sharp contrast to current levels of student achievement. For example, in 
Denver schools that did not make adequate yearly progress (AYP), more than 98 percent of tenured 
teachers received the highest rating, satisfactory.¹⁶ On average, over the last three years, only 
10 percent¹⁷ of failing schools issued at least one unsatisfactory rating to a tenured teacher. 

FIGURE 03 | Frequency of Unsatisfactory Ratings in Denver Public Schools that Did Not Meet AYP¹⁸ 

[Chart by school year (SY 05-06, SY 06-07, SY 07-08) comparing the number of schools not meeting AYP with the number of those schools with at least one tenured teacher rated unsatisfactory.] 

These findings are consistent with a one-year snapshot of data from other districts. Less than 
10 percent of Rockford’s failing schools rated a tenured teacher unsatisfactory in 2007-08, and 
none of Cincinnati’s failing schools did. 

FIGURE 04 | Rockford Public Schools & Cincinnati Public Schools AYP Data (SY 07-08)¹⁹ 

[Chart for Rockford Public Schools and Cincinnati Public Schools comparing the number of schools not meeting AYP with the number of those schools with at least one tenured teacher rated unsatisfactory.] 



Moreover, it is important to note that performance simply goes untracked for a subset of teachers. 
In some cases, this is systemic. One of the 12 districts studied does not centrally track or record 
any evaluation data at all. However, in many other cases, it reflects the perfunctory nature of the 
evaluation system itself, as 9 percent of teachers surveyed appear to have missed their most recent 
scheduled evaluation.²⁰






